Skip to content

Add task memory related session properties #14016

Merged
arhimondr merged 1 commit intoprestodb:masterfrom
aweisberg:memory_session_properties
Jan 30, 2020
Merged

Add task memory related session properties #14016
arhimondr merged 1 commit intoprestodb:masterfrom
aweisberg:memory_session_properties

Conversation

@aweisberg
Copy link
Contributor

@aweisberg aweisberg commented Jan 27, 2020

query_max_total_memory_per_node is required for a specific memory constrained use cases. query_max_memory_per_node is added for completeness.

== RELEASE NOTES ==

General Changes
* Added query_max_total_memory_per_node and query_max_memory_per_node session properties

@aweisberg
Copy link
Contributor Author

One question is whether these should be called max-memory-per-node or max-total-memory-per-node. This would be more consistent with the node configuration property names, but less accurate?

@aweisberg aweisberg force-pushed the memory_session_properties branch 2 times, most recently from fa0526c to 67c3e9a Compare January 27, 2020 20:04
@wenleix
Copy link
Contributor

wenleix commented Jan 27, 2020

The commit subject looks too long, here is the 7 rules: https://chris.beams.io/posts/git-commit/

- Separate subject from body with a blank line
- Limit the subject line to 50 characters
- Capitalize the subject line
- Do not end the subject line with a period
- Use the imperative mood in the subject line
- Wrap the body at 72 characters
- Use the body to explain what and why vs. how

Note sometimes it's too difficult to limit the subject line to 50 characters, but we still want to keep as short as possible (e.g. 60 characters )

What about Add task memory related session properties and explain what added in commit message ? :)

@aweisberg aweisberg force-pushed the memory_session_properties branch 2 times, most recently from 85c62da to 9b4dc18 Compare January 27, 2020 20:39
@rschlussel
Copy link
Contributor

I would call them max_memory_per_node for consistency. In general our session properties are snake case versions of the config properties. Naming them differently makes it hard to reason about what configuration properties are being overridden.

Copy link
Member

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % nit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static import both

@arhimondr
Copy link
Member

Please make sure the tests pass

@aweisberg aweisberg force-pushed the memory_session_properties branch 3 times, most recently from 5b9d5de to 4928880 Compare January 28, 2020 23:15
@wenleix wenleix changed the title Add query_max_task_memory and query_max_total_task_memory session pro… dd task memory related session properties Jan 29, 2020
@wenleix wenleix changed the title dd task memory related session properties Add task memory related session properties Jan 29, 2020
@aweisberg
Copy link
Contributor Author

@rschlussel made the requested changes, tests are passing.

Copy link
Contributor

@swapsmagic swapsmagic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we do consider max-query-memory and max-query-total-memory in ClusterMemoryManager to kill a query when it uses more memory. Do we want to do the same with this new session properties as well?
Also don't we need the default value for this session properties?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to mention task here? or just
Maximum amount of total (user + system) memory a query can use

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes otherwise the description makes it seem like this a distributed memory setting.

Copy link
Contributor

@swapsmagic swapsmagic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore the default value comment, realized it does have a default value.

@aweisberg
Copy link
Contributor Author

aweisberg commented Jan 29, 2020

Currently, we do consider max-query-memory and max-query-total-memory in ClusterMemoryManager to kill a query when it uses more memory. Do we want to do the same with this new session properties as well?

Per node memory limits are enforced at the node as the memory is claimed. Why does ClusterMemoryManager need to be involved?

Also you are quoting max-query-memory and max-query-total-memory, these are pre-existing session properties that already work and are consulted by ClusterMemoryManager. Did you mean the per-node variants.

@aweisberg aweisberg requested a review from swapsmagic January 29, 2020 19:09
@swapsmagic
Copy link
Contributor

Yes i meant the new per node memory properties. Didn't know that we already have some checks at node level. If it's already been done then it's fine.

@rschlussel
Copy link
Contributor

rschlussel commented Jan 29, 2020

A couple questions

  1. Can we remove the direct usage of the configuration property from SqlTaskManager?
  2. Should we throw an exception if someone tries to configure the session property greater than the config property so we don't run into problems with users configuring this inappropriately. (setting the maxMemoryPeroNode too high would also lead to a misconfigured reserved pool if the reserved pool is enabled)

@aweisberg
Copy link
Contributor Author

aweisberg commented Jan 29, 2020

  1. Can we remove the direct usage of the configuration property from SqlTaskManager?

Where would be the appropriate place to query the session?

  1. Should we throw an exception if someone tries to configure the session property greater than the config property so we don't run into problems with users configuring this inappropriately. (setting the maxMemoryPeroNode too high would also lead to a misconfigured reserved pool if the reserved pool is enabled)

Absolutely! Good catch.

@rschlussel
Copy link
Contributor

  1. Can we remove the direct usage of the configuration property from SqlTaskManager?

Where would be the appropriate place to query the session?

I mean that it looks like we set maxQueryUserMemoryPerNode in the query context directly from the config here:

DataSize maxQueryUserMemoryPerNode = nodeMemoryConfig.getMaxQueryMemoryPerNode();
DataSize maxQueryTotalMemoryPerNode = nodeMemoryConfig.getMaxQueryTotalMemoryPerNode();
DataSize maxQuerySpillPerNode = nodeSpillConfig.getQueryMaxSpillPerNode();
queryContexts = CacheBuilder.newBuilder().weakValues().build(CacheLoader.from(
queryId -> createQueryContext(queryId, localMemoryManager, nodeMemoryConfig, localSpillManager, gcMonitor, maxQueryUserMemoryPerNode, maxQueryTotalMemoryPerNode, maxQuerySpillPerNode)));

We shouldn't need that anymore since when we need that information we get it from the session property

Also, the nodeMemoryConfig that's getting passed in to createQueryContext isn't ever used.

@aweisberg aweisberg force-pushed the memory_session_properties branch from 4928880 to dc97d4d Compare January 29, 2020 21:28
@aweisberg
Copy link
Contributor Author

aweisberg commented Jan 29, 2020

I mean that it looks like we set maxQueryUserMemoryPerNode in the query context directly from the config here:

DataSize maxQueryUserMemoryPerNode = nodeMemoryConfig.getMaxQueryMemoryPerNode();
DataSize maxQueryTotalMemoryPerNode = nodeMemoryConfig.getMaxQueryTotalMemoryPerNode();
DataSize maxQuerySpillPerNode = nodeSpillConfig.getQueryMaxSpillPerNode();
queryContexts = CacheBuilder.newBuilder().weakValues().build(CacheLoader.from(
queryId -> createQueryContext(queryId, localMemoryManager, nodeMemoryConfig, localSpillManager, gcMonitor, maxQueryUserMemoryPerNode, maxQueryTotalMemoryPerNode, maxQuerySpillPerNode)));

We shouldn't need that anymore since when we need that information we get it from the session property

So the problem with removing the initial values from the query context construction is that we can technically construct the query context using only a task ID (no session). It seems safer to have an initial value for those based on the configuration rather then 0 or something hard coded.

(We talked about this in person just copying it here for posterity).

I removed NodeMemoryConfig from createQueryContext.

Add query_max_task_memory and query_max_total_task_memory session
properties
@aweisberg aweisberg force-pushed the memory_session_properties branch from dc97d4d to cd50167 Compare January 29, 2020 21:30
@arhimondr arhimondr merged commit fc3d1ee into prestodb:master Jan 30, 2020
@caithagoras caithagoras mentioned this pull request Feb 20, 2020
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants